NHR PerfLab Seminar 2024-02-27: Performance of linear solvers in tensor-train format on current multicore architectures

So as Björk said, I'm from the Institute for Software Technology at DLR, and part of our group works on linear algebra methods. Today I'm looking at some of our work in tensor linear algebra, and the main message I want to get across is how to map more complex high-level algorithms onto lower-level building blocks. This is closely related to a talk in this series from last year by Professor Bientinesi on the linear algebra mapping problem.

He looks at that in a high-level way for general linear algebra operations; I'm just picking two examples where tensor algorithms give you even more freedom to change things.

So first of all, I want to consider two examples. The first one is a bit simpler; that's why it's sort of the main topic. You start with some large dense array and you want to calculate a low-rank approximation of it.

I will come to the format in a minute. You are either given a tolerance or a maximum rank that you want to have, and you are looking for the best, or close to the best, approximation within that tolerance in a specific tensor format.
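To make this first problem concrete (jumping ahead to the tensor-train format introduced below), a minimal numpy sketch of the classical TT-SVD idea could look as follows: successive truncated SVDs, cut off by a relative singular-value tolerance and/or a maximum rank. This is only an illustration of the algorithmic idea, not the optimized implementation discussed in the talk.

```python
import numpy as np

def tt_svd(tensor, max_rank=None, tol=0.0):
    """Sketch of the classical TT-SVD: split a dense d-dimensional array
    into d three-dimensional cores via successive truncated SVDs."""
    dims = tensor.shape
    cores, r_prev = [], 1
    work = tensor.reshape(dims[0], -1)
    for k in range(len(dims) - 1):
        u, s, vt = np.linalg.svd(work, full_matrices=False)
        r = int(np.sum(s > tol * s[0])) if tol > 0 else len(s)  # tolerance cut
        if max_rank is not None:
            r = min(r, max_rank)                                 # max-rank cut
        cores.append(u[:, :r].reshape(r_prev, dims[k], r))
        work = (s[:r, None] * vt[:r]).reshape(r * dims[k + 1], -1)
        r_prev = r
    cores.append(work.reshape(r_prev, dims[-1], 1))
    return cores
```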

The other high-level algorithm I want to look at is solving a linear system in that setting. So I'm given a linear operator, already in a low-rank format, and similarly a right-hand-side vector and a tolerance, and I want to iteratively determine a solution, all without blowing the problem up to a fully dense representation, always remaining in this low-rank approximation of the data.
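Just to sketch the structure of such an iteration (a generic rank-truncated fixed-point loop, not the particular solver from the talk), one can picture something like the following, where `apply_op` and `truncate` stand in for the tensor-train operator application and recompression:

```python
import numpy as np

def truncated_iteration(apply_op, b, truncate, omega=1.0, tol=1e-8, max_iter=100):
    """Generic rank-truncated iteration x <- truncate(x + omega * (b - A x)):
    every iterate is recompressed so it stays in the low-rank format."""
    x = truncate(0.0 * b)
    for _ in range(max_iter):
        residual = b - apply_op(x)
        if np.linalg.norm(residual) <= tol * np.linalg.norm(b):
            break
        x = truncate(x + omega * residual)
    return x

# Dense stand-in just to show the call pattern:
A = np.diag(np.arange(1.0, 5.0))
b = np.ones(4)
x = truncated_iteration(lambda v: A @ v, b, truncate=lambda v: v, omega=0.2)
```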

Some background there: I put "tensor-train format" on the slides; in physics it has been known for much longer as matrix product states, and here it's used in a slightly different setting, but one very close to that. And I tried to find some notation to show how this works.

So for a d-dimensional tensor you have d smaller sub-tensors. Each of them is three-dimensional, and you always multiply the last dimension of one sub-tensor with the first dimension of the next sub-tensor; that's what I wanted to show with this product here. So it's defined as a contraction between those. In two dimensions this would just be a matrix-matrix multiplication, and in higher dimensions you can approximate a much bigger, full high-order tensor with this list of sub-tensors. All the dimensions in the middle are known as bond dimensions in physics, or simply as ranks in linear algebra.
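Written out as a formula, in notation chosen here for illustration (the slides may use a slightly different one), the tensor-train approximation of a d-dimensional tensor with cores $X_1,\dots,X_d$ reads

$$
\mathcal{X}(i_1,\dots,i_d) \;\approx\; \sum_{k_1=1}^{r_1}\cdots\sum_{k_{d-1}=1}^{r_{d-1}} X_1(1,i_1,k_1)\, X_2(k_1,i_2,k_2)\cdots X_d(k_{d-1},i_d,1),
$$

where each core $X_\mu$ is three-dimensional and the summed indices $k_\mu$ run over the bond dimensions / ranks $r_\mu$.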

The next thing you can construct with that is an operator, where you take the same idea but now have two dimensions of each sub-tensor that are not contracted. So in this second case each sub-tensor has four dimensions: the left and the right ones are contracted, and the two in the middle are left open. This defines a linear mapping from one tensor train to another tensor train. For the linear systems we will just use n^d as the dimension, to simplify things.
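In the same illustrative notation, the operator (a matrix product operator in physics terms) has four-dimensional cores $A_\mu$ with two open indices each,

$$
\mathcal{A}\big((i_1,\dots,i_d),(j_1,\dots,j_d)\big) \;=\; \sum_{k_1,\dots,k_{d-1}} A_1(1,i_1,j_1,k_1)\, A_2(k_1,i_2,j_2,k_2)\cdots A_d(k_{d-1},i_d,j_d,1),
$$

so it defines a linear map on vectors of length $n^d$ that are stored as tensor trains.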

Then, in physics, there is a notation known as tensor network notation. I don't know if there's one specific standard variant; some of the papers show it slightly differently, but it's helpful for showing some of the algorithms later. A plain dot is a scalar; a dot with one outgoing edge is a vector.
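To connect the diagram language with concrete operations, here is a small numpy/einsum illustration of my own (not from the slides): the number of open edges of a node is the number of free indices, and joining two edges means contracting over that index.

```python
import numpy as np

A = np.random.rand(4, 5)        # node with two open edges -> a matrix
x = np.random.rand(5)           # node with one open edge  -> a vector
y = np.einsum('ij,j->i', A, x)  # join one edge of A with x: one open edge left, a vector
s = np.einsum('i,i->', y, y)    # join the remaining edges: no open edges, a scalar
```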

Part of a video series:
Part of a chapter: NHR@FAU PerfLab Seminar
Accessible via: Open access
Duration: 00:34:40 min

Recording date: 2024-02-27
Uploaded on: 2024-03-01 10:56:03
Language: en-US

Speaker: Melven Röhrig-Zöllner, German Aerospace Center (DLR)
Title: Performance of linear solvers in tensor-train format on current multicore architectures
Date and time: Tuesday, February 27, 2024, 2:00-3:00 p.m. CET
Slides: https://hpc.fau.de/files/2024/03/NHR_PerfLab_2024-02-27_Performance-of-linear-solvers-in-TT-format.pdf
Abstract:
This talk discusses the node-level performance of numerical algorithms for handling high-dimensional problems in a compressed tensor format.
It focusses on two problems in particular: (1) approximating large (dense) data (lossy compression) and (2) solving linear systems, both in the tensor-train / matrix-product states format. For both problems, we optimize the required underlying linear algebra operations, that is, the mapping of the high-level algorithm to (potentially less accurate) lower-level operations. In particular, we suggest improvements for costly orthogonalization and truncation steps based on a high-performance implementation of a “Q-less” tall-skinny QR decomposition. Further optimizations for solving linear systems include memory layout optimizations for faster tensor contractions and a simple generic preconditioner. We show performance results on today’s multi-core CPUs, where we obtain a speedup of up to ~50x over the reference implementation for the lossy compression, and up to ~5x for solving linear systems.
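As a rough illustration of what a “Q-less” tall-skinny QR means, the toy numpy sketch below computes only the R factor of a tall-skinny matrix by reducing per-block R factors and never forms Q explicitly; the high-performance implementation referred to in the abstract is of course considerably more involved.

```python
import numpy as np

def qless_tsqr(A, nblocks=4):
    """Toy 'Q-less' TSQR: QR of each row block, keep only the R factors,
    then one more QR of the stacked R factors; Q is never formed."""
    blocks = np.array_split(A, nblocks, axis=0)
    Rs = [np.linalg.qr(block, mode='r') for block in blocks]
    return np.linalg.qr(np.vstack(Rs), mode='r')
```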
Short Bio:
Melven Röhrig-Zöllner studied Computational Engineering Science at RWTH Aachen. Since 2014, he has worked as a researcher in the HPC department of the Institute for Software Technology of the German Aerospace Center (DLR). His research focuses on the performance of numerical methods, in particular in the field of numerical linear algebra. He also supports scientific software development for HPC systems at DLR, e.g., concerning software engineering practices and testing parallel codes.